Off-Policy Monte Carlo Prediction with Importance Sampling
How importance sampling lets us estimate value functions under a target policy using episodes collected by a different behavior policy.
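
As a rough illustration of the idea, the sketch below reweights each episode's return by the product of probability ratios between the target and behavior policies, then averages. It is a minimal example under stated assumptions, not a reference implementation: the function names (`importance_ratio`, `episode_return`, `off_policy_mc_estimates`), the representation of policies as nested dictionaries of action probabilities, and the toy one-state problem are all hypothetical choices made for this example. It also assumes the behavior policy gives nonzero probability to every action the target policy might take.

```python
def importance_ratio(episode, target, behavior):
    """Product of target(a|s) / behavior(a|s) over the steps of one episode."""
    rho = 1.0
    for state, action, _reward in episode:
        rho *= target[state][action] / behavior[state][action]
    return rho


def episode_return(episode, gamma=1.0):
    """Discounted return from the first step of the episode."""
    g = 0.0
    for _state, _action, reward in reversed(episode):
        g = reward + gamma * g
    return g


def off_policy_mc_estimates(episodes, target, behavior, gamma=1.0):
    """Return (ordinary, weighted) importance-sampling estimates of the
    start-state value under the target policy."""
    weighted_return_sum = 0.0
    ratio_sum = 0.0
    for episode in episodes:
        rho = importance_ratio(episode, target, behavior)
        weighted_return_sum += rho * episode_return(episode, gamma)
        ratio_sum += rho
    ordinary = weighted_return_sum / len(episodes)
    # Weighted importance sampling normalizes by the sum of ratios instead.
    weighted = weighted_return_sum / ratio_sum if ratio_sum > 0 else 0.0
    return ordinary, weighted


# Toy problem (hypothetical): one state, "right" pays 1, "left" pays 0.
target = {"s": {"left": 0.0, "right": 1.0}}    # policy we want to evaluate
behavior = {"s": {"left": 0.5, "right": 0.5}}  # policy that generated the data

episodes = [
    [("s", "right", 1.0)],
    [("s", "left", 0.0)],
    [("s", "right", 1.0)],
    [("s", "left", 0.0)],
]

ordinary, weighted = off_policy_mc_estimates(episodes, target, behavior)
print(ordinary, weighted)
```

In this toy data, episodes that take "left" receive a ratio of 0 and drop out, while episodes that take "right" receive a ratio of 2, so both estimates recover the target policy's value for the start state even though half the data came from an action the target policy never takes.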